Model predictive control (MPC) is an effective method for controlling robotic systems, particularly autonomous aerial vehicles such as quadcopters. However, application of MPC can be computationally demanding, and typically requires estimating the state of the system, which can be challenging in complex, unstructured environments. Reinforcement learning can in principle forego the need for explicit state estimation and acquire a policy that directly maps sensor readings to actions, but is difficult to apply to unstable systems that are liable to fail catastrophically during training before an effective policy has been found. We propose to combine MPC with reinforcement learning in the framework of guided policy search, where MPC is used to generate data at training time, under full state observations provided by an instrumented training environment. This data is used to train a deep neural network policy, which is allowed to access only the raw observations from the vehicle's onboard sensors. After training, the neural network policy can successfully control the robot without knowledge of the full state, and at a fraction of the computational cost of MPC. We evaluate our method by learning obstacle avoidance policies for a simulated quadrotor, using simulated onboard sensors and no explicit state estimation at test time.
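To make the training/deployment split concrete, the sketch below reduces the idea to its simplest supervised core: an MPC teacher with access to the full state labels trajectories, and a student policy is regressed from raw observations onto the teacher's actions. Everything here is an illustrative assumption rather than the paper's method: the "MPC teacher" is a dummy linear controller, the sensor model is a fixed random projection, and the student is a ridge-regression map instead of a deep neural network; the full guided policy search procedure additionally adapts the teacher to the student across iterations.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, OBS_DIM, ACT_DIM = 12, 40, 4  # toy dimensions, chosen arbitrarily

# Hypothetical fixed sensor model: maps the full state to a raw observation.
SENSOR_MAP = rng.standard_normal((OBS_DIM, STATE_DIM))

def onboard_observation(state):
    """Raw onboard sensor reading; the learned policy sees only this."""
    return np.tanh(SENSOR_MAP @ state)

def mpc_action(state):
    """Stand-in for the MPC teacher, which requires the full state.
    A real implementation would solve a finite-horizon optimal control
    problem at every step; here, a dummy stabilizing feedback law."""
    return -0.1 * state[:ACT_DIM]

def step(state, action):
    """Toy dynamics stand-in for the instrumented training simulator."""
    next_state = state.copy()
    next_state[:ACT_DIM] += 0.05 * action
    return next_state + 0.01 * rng.standard_normal(STATE_DIM)

# --- Training time: MPC under full state supervises the policy ------------
obs_buf, act_buf = [], []
for episode in range(20):
    state = rng.standard_normal(STATE_DIM)
    for t in range(50):
        action = mpc_action(state)                  # teacher uses full state
        obs_buf.append(onboard_observation(state))  # student sees raw obs only
        act_buf.append(action)
        state = step(state, action)

X = np.asarray(obs_buf)  # (N, OBS_DIM) observations
Y = np.asarray(act_buf)  # (N, ACT_DIM) teacher actions

# Fit the student by ridge regression (a deep network in the paper).
lam = 1e-3
W = np.linalg.solve(X.T @ X + lam * np.eye(OBS_DIM), X.T @ Y)

def policy(observation):
    """Learned policy: maps raw sensor readings directly to actions,
    with no state estimator and no online optimization."""
    return observation @ W

# --- Test time: control from raw observations alone ------------------------
state = rng.standard_normal(STATE_DIM)
print("policy action:", policy(onboard_observation(state)))
```

The point of the split is visible in the last two lines: once trained, the policy is a single cheap forward pass on the raw observation, which is why it runs at a fraction of the computational cost of MPC and needs no explicit state estimate at test time.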